Modified Self-organizing Maps for Line Extraction in Digitized Text Documents
Abstract
Several authors have developed modifications of Kohonen Self-Organizing Maps to solve known combinatorial optimization problems. This paper proposes a modification of the Kohonen map for detecting the white inter-text spaces in digitized plain-text documents. The idea relies on the fact that the line extraction problem has several features that map naturally onto Kohonen networks, although the original learning rule must first be adapted to the problem. Tests on different digitized text images show that the method is able to segment lines.
Related resources
Corporate Decision Making with Self-Organizing Patent Maps Labeled by Technical Terms and AHP
In this paper, we propose an approach for corporate decision making with self-organizing patent maps labeled by technical terms and AHP. First, we select the patent area of interest and collect pertinent patent documents in text format. Second, we extract keywords by text mining to transform patent documents into feature vectors of the companies. Third, we input the feature matrix of technical ...
Segmentation of Digitized Mammograms Using Self-Organizing Maps in a Breast Cancer Computer Aided Diagnosis System
The objective of this work is to develop a digitized mammograms’ feature extraction approach using Kohonen’s Self-Organizing Maps (SOM). Once developed, the SOM network will be used as the first processing stage in a breast cancer computer aided diagnosis (CAD) system. Its role will be to offer segmented data as input to a second stage dedicated to the diagnosis task, which will be implemented ...
A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps
With the increasing amount of multilingual texts on the Internet, multilingual text retrieval techniques have become an important research issue. However, the discovery of relationships between different languages remains an open problem. In this paper we propose a method, which applies the growing hierarchical self-organizing map (GHSOM) model, to discover knowledge from multilingual text docu...
Landforms identification using neural network-self organizing map and SRTM data
During an 11-day mission in February 2000, the Shuttle Radar Topography Mission (SRTM) collected data over 80% of the Earth's land surface, for all areas between 60 degrees N and 56 degrees S latitude. Since SRTM data became available, many studies utilized them for application in topography and morphometric landscape analysis. Exploiting SRTM data for recognition and extraction of topographic ...
Word-Streams for Representing Context in Word Maps
The most prominent use of Self-Organizing Maps (SOMs) in text archiving and retrieval is the WEBSOM. In WEBSOM, a map is first used to reduce the dimensionality of the huge term frequency table by training a so-called word-category map. This word-category map is then used to convert the individual documents into their respective document signatures (i.e. histogram of words) which form the basis ...
Publication date: 2003